Billion-Node Graph Challenges

Authors

  • Yanghua Xiao
  • Bin Shao
Abstract

The graph is a universal model in the big data era and finds wide application in a variety of real-world tasks. The recent emergence of big graphs, especially those with billions of nodes, poses great challenges for their effective management and mining. In general, a distributed computing paradigm is necessary to overcome the billion-node graph challenges. In this article, we elaborate on the challenges of managing and mining billion-node graphs in a distributed environment. We also propose a set of general principles for developing effective solutions to handle billion-node graphs, based on our previous experience in managing such graphs. The article closes with a brief discussion of open problems in billion-node graph management.

Similar resources

Toward a Distance Oracle for Billion-Node Graphs

The emergence of real life graphs with billions of nodes poses significant challenges for managing and querying these graphs. One of the fundamental queries submitted to graphs is the shortest distance query. Online BFS (breadth-first search) and offline pre-computing pairwise shortest distances are prohibitive in time or space complexity for billion-node graphs. In this paper, we study the fea...

Centralities in Large Networks: Algorithms and Observations

Node centrality measures are important in a large number of graph applications, from search and ranking to social and biological network analysis. In this paper we study node centrality for very large graphs, up to billions of nodes and edges. Various definitions for centrality have been proposed, ranging from very simple (e.g., node degree) to more elaborate. However, measuring centrality in b...

Topology Control in Wireless Sensor Network using Fuzzy Logic

Sensor networks consist of sensor nodes, each of which covers a limited area. The most common use of these networks is in unreachable fields. The sink is a node that collects data from other nodes. One of the main challenges in these networks is the limited battery (power supply) of the nodes. Therefore, the use of topology control is required to decrease power consumption and increase network acce...

Thinking Like a Vertex: a Survey of Vertex-Centric Frameworks for Distributed Graph Processing

The vertex-centric programming model is an established computational paradigm recently incorporated into distributed processing frameworks to address challenges in large-scale graph processing. Billion-node graphs that exceed the memory capacity of standard machines are not well-supported by popular Big Data tools like MapReduce, which are notoriously poor-performing for iterative graph algorit...

An SSD-based eigensolver for spectral analysis on billion-node graphs

Many eigensolvers such as ARPACK and Anasazi have been developed to compute eigenvalues of a large sparse matrix. These eigensolvers are limited by the capacity of RAM. They run in memory of a single machine for smaller eigenvalue problems and require the distributed memory for larger problems. In contrast, we develop an SSD-based eigensolver framework called FlashEigen, which extends Anasazi e...

Journal title:
  • IEEE Data Eng. Bull.

Volume 40

Publication date: 2017